
feat(email): daily-driver UI pass - pre-scan card, in-chat Connect, session prefs #995

Open
itomek wants to merge 3 commits into main from tmi/amazing-villani-7b995a


itomek commented on May 8, 2026

Summary

Closes the four UI gaps that turn the shipped EmailTriageAgent into something a user opens daily: structured pre-scan card with inline action buttons, in-chat Connect Google CTA, session-scoped triage preferences, and the adversarial-review fixes that emerged in the same pass.

Why

Before: picking Email Triage in Agent UI and sending "Triage my inbox" returns prose. It is a CLI experience pasted into chat, with no scannable summary and no inline actions, and OAuth requires leaving for Settings.

After: a typed email_pre_scan envelope mounts as a triage card with three sections (urgent / actionable / suggested archives) and per-row Approve / Reply / Open / Dismiss buttons; an inline Connect Google CTA renders when the agent surfaces an auth-required error; in-session sender preferences (set_priority_sender, set_low_priority_sender, set_category_default) are honored by both triage_inbox and pre_scan_inbox.

Adversarial review found four issues, all fixed in this same pass: a P0 rendering bug (stripBogusCodeFences was eating the email_pre_scan fence before the markdown renderer saw it, so the card would never have mounted), a subject-text prompt-injection path through Approve dispatch, a phishing-override safety hole, and duplicate dispatches on double-click.

Linked issue

Refs #645

Test plan

  • pytest tests/unit/agents/test_email_agent*.py tests/unit/test_email_cli.py → 132 passed (130 baseline + 2 new — phishing-override + system-prompt canary).
  • python util/lint.py --all clean.
  • npx tsc --noEmit + npm run build in src/gaia/apps/webui clean.
  • v0.17.6 installer smoke on t-nx-strx-halo (Ubuntu 24.04, Strix Halo): AppImage and deb paths both green end-to-end (shim creation, gaia --version, gaia connectors list, deb upgrade 0.17.1 → 0.17.6, apt remove gaia cleanup).

Branch strategy

Drafted intentionally — depends on Kalin's memory + 5-agent split PR landing on main first (per 2026-05-07 scrum). The in-memory _session_preferences schema is designed to migrate cleanly to that store: init_session_preferences() becomes a loader, the mutator tools write through, kwarg signatures on triage_inbox_impl / pre_scan_inbox_impl don't change.

Checklist

  • I have linked a GitHub issue above (Refs #645).
  • I have described why this change is being made, not just what changed.
  • I have run linting and tests locally (python util/lint.py --all, pytest tests/unit/).
  • I have updated documentation if user-visible behavior changed.

…ession prefs

Closes the four UI gaps that turn the shipped EmailTriageAgent (#965) into a
tool a user actually opens daily. "Run a pre-scan" now returns a typed
``email_pre_scan`` envelope that the chat surface mounts as a structured
triage card with three sections (urgent / actionable / suggested archives) and
inline Approve / Reply / Open / Dismiss buttons — no chat-turn required to act
on each item. When the agent surfaces an OAuth auth-required error, an inline
``Connect Google`` CTA renders in the same message bubble (no Settings nav).
Session-scoped triage preferences (``set_priority_sender``,
``set_low_priority_sender``, ``set_category_default``,
``clear_session_preferences``) are honored by ``triage_inbox`` and
``pre_scan_inbox`` for the lifetime of the agent — wiped on restart by design,
clean migration path to the persistent memory subsystem when that lands.
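
A minimal sketch of how session-scoped preferences with the phishing bypass from this PR could look. The ``SessionPreferences`` class, the ``apply_preferences`` helper, the message dict keys (``sender``, ``bucket``, ``category``, ``phishing``) and the bucket names are all assumptions made for illustration; the actual ``_session_preferences`` schema is not shown in this PR text.

```python
from dataclasses import dataclass, field


@dataclass
class SessionPreferences:
    """Hypothetical in-memory store; lives only as long as the agent process."""

    priority_senders: set[str] = field(default_factory=set)
    low_priority_senders: set[str] = field(default_factory=set)
    category_defaults: dict[str, str] = field(default_factory=dict)

    def set_priority_sender(self, sender: str) -> None:
        sender = sender.lower()
        self.low_priority_senders.discard(sender)  # a sender is in one list at most
        self.priority_senders.add(sender)

    def set_low_priority_sender(self, sender: str) -> None:
        sender = sender.lower()
        self.priority_senders.discard(sender)
        self.low_priority_senders.add(sender)

    def set_category_default(self, category: str, bucket: str) -> None:
        self.category_defaults[category] = bucket

    def clear(self) -> None:
        self.priority_senders.clear()
        self.low_priority_senders.clear()
        self.category_defaults.clear()


def apply_preferences(message: dict, prefs: SessionPreferences) -> str:
    """Return the triage bucket for one message after session overrides."""
    # Assumed rule: a phishing-flagged message keeps the classifier's bucket,
    # so no sender preference can promote or demote it.
    if message.get("phishing"):
        return message["bucket"]
    sender = message["sender"].lower()
    if sender in prefs.priority_senders:
        return "urgent"
    if sender in prefs.low_priority_senders:
        return "suggested_archive"
    return prefs.category_defaults.get(message.get("category", ""), message["bucket"])
```

Keeping the phishing check ahead of every override means a spoofed message from a "priority" address cannot ride that preference into the urgent section.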

Adversarial review caught a showstopper before merge: ``email_pre_scan`` was
being stripped by ``stripBogusCodeFences`` in MessageBubble before the markdown
renderer saw it, so the card would never have mounted in production. Fixed in
the same pass with three other findings: id-only Approve/Reply dispatch
(closes a prompt-injection path through email subject text), phishing-aware
preference application (a phishing-flagged message bypasses both priority and
low-priority overrides), and a single-flight per-card guard against double-
click duplicate dispatches.
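
The id-only dispatch and the single-flight guard can be sketched together. This is an illustration in Python, not the webui code: ``CardDispatcher``, the id pattern, and the payload shape are hypothetical. The point is that only an action name and a validated message id leave the card, so attacker-controlled subject text never reaches the agent, and a second click is dropped while the first is still in flight.

```python
import re

# Assumed id shape; the real message ids may use a different alphabet or length.
_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")
_ALLOWED_ACTIONS = {"approve", "reply"}


class CardDispatcher:
    """Hypothetical per-card dispatcher: id-only payloads, one action in flight."""

    def __init__(self, send):
        self._send = send    # callable that delivers the payload to the agent
        self._busy = False   # single-flight flag for this card

    def dispatch(self, action: str, message_id: str) -> bool:
        """Send the action; return False if a dispatch is already in flight."""
        if action not in _ALLOWED_ACTIONS:
            raise ValueError(f"unsupported action: {action!r}")
        if not _ID_RE.fullmatch(message_id):
            raise ValueError("message id has an unexpected shape")
        if self._busy:
            return False     # double-click: ignore the duplicate
        self._busy = True
        # Only the action and the id are sent; subject and body stay behind.
        self._send({"action": action, "message_id": message_id})
        return True

    def settle(self) -> None:
        """Call when the agent has answered, re-enabling the card."""
        self._busy = False
```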

Drafted intentionally — depends on Kalin's memory + 5-agent split PR landing
on main first. The in-memory ``_session_preferences`` schema is designed to
migrate to that store cleanly without touching the agent or read-tool kwargs.

Tests: 132 email unit tests pass (130 baseline + 2 new — phishing-override
and a system-prompt canary asserting the ``email_pre_scan`` instruction is
still present). Lint clean, TypeScript clean, webui build clean. v0.17.6
installer smoke verified end-to-end on t-nx-strx-halo (Ubuntu 24.04, Strix
Halo) for both AppImage and deb paths.

Ref: #645
github-actions bot added the documentation, tests, electron, and agents labels on May 8, 2026
itomek self-assigned this on May 8, 2026
… mount)

Live-testing PR #995 in the Agent UI on Gemma-4-E4B exposed exactly the
failure mode three of the six adversarial-reflection agents had predicted:
the LLM paraphrased the ``pre_scan_inbox`` envelope into prose ("Here is
a summary of your inbox status…") instead of echoing it verbatim inside
the ``email_pre_scan`` fence the system prompt told it to emit. The tool
fired correctly with real data; the structured envelope was sitting in
the agent-step debug panel; the assistant message was prose; the card
never mounted. Conditioning a card-rendering contract on LLM compliance
is fragile by construction — the data is server-side already, so the
LLM doesn't need to relay it.

This patch moves the fence emission to the SSE handler:

- ``SSEOutputHandler._capture_render_payload`` watches ``pretty_print_json``
  for ``pre_scan_inbox`` results and buffers the inner ``data`` envelope
  on the handler. Handles both dict and JSON-string payloads — the
  ``@tool`` dispatch hands strings to ``pretty_print_json``, which the
  initial implementation missed (caught by retest, fixed before commit).
- ``print_final_answer`` drains the buffer and prepends each pending
  payload as a fenced code block (an opening fence tagged ``<lang>``, the
  JSON payload on the following lines, then a closing fence) before the
  LLM's prose. The frontend's existing ``pre`` markdown override (PR #995)
  detects the language tag and mounts ``EmailPreScanCard``. Mount is
  deterministic and no longer depends on LLM token-level compliance.
- The email agent's system prompt drops the brittle "you MUST emit a
  fenced ``email_pre_scan`` block" instruction. Replaced with a single
  sentence telling the LLM the user has already been shown the card and
  to write a brief framing line. Smaller prompt, harder to drift.

This is a deliberate hack — the right fix is a tool-use-tuned model
handling structured emission while a chat-tuned model handles the prose
summary, via Lemonade's multi-model loading. Tracked in a follow-up
issue (linked in the buffer's docstring); removing this hack is part of
that issue's acceptance criteria. The hack is keyed by a class-level
``_RENDER_TOOL_TO_LANG`` map so future structured-render tools (calendar
cards, doc-summary cards) plug into the same pathway by adding one line.
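
The capture-and-prepend flow described above can be sketched roughly as follows. This is a simplified stand-in, not the real ``SSEOutputHandler``: the class name, the method signatures, and the envelope shape (``ok``, ``data``, and a ``kind`` field equal to the fence language) are assumptions inferred from the test names listed in this commit.

```python
import json

FENCE = "`" * 3  # built at runtime so this example nests cleanly in markdown


class RenderInjectingHandler:
    """Hypothetical stand-in showing only the structured-render injection path."""

    # Tool name -> fence language tag the frontend mounts a card for.
    _RENDER_TOOL_TO_LANG = {"pre_scan_inbox": "email_pre_scan"}

    def __init__(self):
        self._pending = []  # (lang, envelope) pairs awaiting the final answer

    def capture_render_payload(self, tool_name, result):
        """Buffer the inner data envelope of a render-capable tool result."""
        lang = self._RENDER_TOOL_TO_LANG.get(tool_name)
        if lang is None:
            return                      # other tools pass through untouched
        if isinstance(result, str):     # tool dispatch may hand over a JSON string
            try:
                result = json.loads(result)
            except json.JSONDecodeError:
                return                  # malformed payload: skip it
        if not isinstance(result, dict) or result.get("ok") is not True:
            return
        data = result.get("data")
        if not isinstance(data, dict) or data.get("kind") != lang:
            return                      # wrong-kind envelopes are rejected
        self._pending.append((lang, data))

    def final_answer(self, prose):
        """Drain the buffer and prepend each payload as a fenced block."""
        pending, self._pending = self._pending, []
        blocks = [f"{FENCE}{lang}\n{json.dumps(data)}\n{FENCE}" for lang, data in pending]
        if prose:
            blocks.append(prose)
        return "\n\n".join(blocks)
```

Under these assumptions, adding a future render tool is one more entry in ``_RENDER_TOOL_TO_LANG``, and swapping the buffer out in ``final_answer`` is what keeps a payload from leaking into the next turn.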

Verified live in the Agent UI on Gemma-4-E4B with my real Gmail: card
mounted on first try with 5 actionable items (Kalin, Chaked, Maximillian,
ObjectWin Accounts), 1 suggested archive (Google Gemini), 17 informational
counted in the header, plus the LLM's one-line prose framing below.

Tests: 328 unit tests pass total. Added 9 cases in
``test_sse_handler.py::TestStructuredRenderInjection`` covering envelope
capture (dict + JSON-string + malformed), fence-prepend ordering,
empty-prose path, non-pre-scan tools passthrough, ok=False rejection,
wrong-kind rejection, buffer drained between turns. The system-prompt
canary test loosened to assert ``pre_scan_inbox`` only (the
``email_pre_scan`` literal is no longer in the prompt — it lives in the
SSE hook now).
…1000

The fence-injection commit (f0212e2) referenced ``#997`` for the
multi-model follow-up. The actual issue landed as #1000. Update all
docstrings, comments, and the canary test to point at the right number.

No behavioral change.